Abstract: Data mining has various applications in
customer relationship management (CRM). In this proposal, we
introduce a framework for identifying appropriate
data mining techniques for various CRM activities. This
research attempts to integrate the data mining and CRM
models and to propose a new model of data mining for
CRM. The new model specifies which types of data mining
processes are suitable for which stages/processes of CRM.
To develop an integrated model, it is important to
understand the existing data mining and CRM models.
Hence, the article discusses some of the existing data
mining and CRM models and finally proposes an
integrated model of data mining for CRM.
I. Introduction

Value Creation for the customer is the key determinant of a 
successful business. Customer satisfaction ensures profitability 
for businesses in the long run. Customer bases built over a 
period of time proved to be of immense help in increasing the 
reach of a particular business's product or service. However, 
the recent increase in the operating costs of business made it 
more compelling for businesses to increase loyalty among 
existing customers while trying to attract new ones. The
process by which an organization creates value for the
customer is often referred to as Customer Relationship
Management (CRM) [1].

According to Microsoft, CRM is "a customer-focused 
business strategy designed to optimize revenue, profitability, 
and customer loyalty. By implementing a CRM strategy, an 
organization can improve the business processes and 
technology solutions around selling, marketing, and servicing 
functions across all customer touch-points (for example: Web, 
e-mail, phone, fax, in-person)". The overall objective of CRM 
applications is to attract, retain and manage a firm's profitable
("right") customers [1]. 

Business intelligence for CRM applications provides a firm 
with actionable information from the analysis and 
interpretation of vast quantities of customer/market related 
data. Databases for business intelligence include customer
demographics, buying histories, cross-sales, service calls,
website navigation experiences and online transactions.
Through the appropriate use of analytical methods and 
software, a firm is able to turn data into information that leads 
to greater insight and development of fact-based strategies 
which in turn helps the firm gain competitive advantage by 
creating greater value for the customer [1]. 

Analogous to traditional mining, which involves searching 
for an ore in a mountain, data mining involves searching for 
valuable information in large databases. Both these processes 
involve either groping through a vast amount of material or 
intelligently probing the data to find the true value that lies 
hidden in data. Data mining involves not only the extraction of 
previously unknown information from a database but also the 
discovery of relationships that did not surface in the previous
methods of data analysis. The "jewels" discovered from the 
data mining process include these non-intuitive hidden 
predictive relationships between variables that explain 
customer behavior and preferences. The predictive capabilities 
of data mining enable the businesses to make proactive, 
knowledge-driven decisions. Data mining tools facilitate 
prospective analysis, which is an improvement over the 
analysis of past events provided by the retrospective tools. The 
emergence of large data warehouses and the availability of data 
mining software is creating opportunities for businesses to find 
innovative ways to implement effective customer relationship 
strategies [1]. 

The automation of data collection and the relative decrease
in the costs of operating huge data warehouses have made
customer data more accessible than ever. The analysis of data,
which until a few years ago was associated with high-end
computing power and algorithms decipherable by only
professional statisticians, is becoming increasingly popular
as user-friendly tools become available on desktops [Berger,
1999]. Data mining plays an important role in the analytical
phases of the CRM life cycle as well as the CRM process [1].

II. Research methodology

As research in CRM and data mining is difficult to confine
to specific disciplines, the relevant materials are scattered
across various journals. Business intelligence and knowledge
discovery are the most common academic disciplines for data
mining research in CRM. Consequently, the following online
journal databases were searched to provide a comprehensive
bibliography of the academic literature on CRM and data
mining:

> ABI/INFORM Database; 

> Academic Search Premier; 

> Business Source Premier; 

> Emerald Fulltext; 

> Ingenta Journals; 

> Science Direct; and 

> IEEE Transactions.

The literature search was based on the descriptors
"customer relationship management" and "data mining",
which originally produced approximately 900 articles. The full
text of each article was reviewed to eliminate those that were 
not actually related to application of data mining techniques in 
CRM. The selection criteria were as follows: 

• Only those articles that had been published in business 
intelligence, knowledge discovery or customer 
management related journals were selected, as these 
were the most appropriate outlets for data mining in 
CRM research and the focus of this review. 

• Only those articles which clearly described how the 
mentioned data mining technique(s) could be applied 
and assisted in CRM strategies were selected. 

• Conference papers, masters and doctoral dissertations,
textbooks and unpublished working papers were
excluded, as academics and practitioners alike most
often use journals to acquire information and
disseminate new findings. Thus, journals represent the
highest level of research.

Each article was carefully reviewed and separately classified
according to the four categories of CRM dimensions and seven
categories of data mining models. Although this search was not
exhaustive, it serves as a comprehensive base for an
understanding of data mining research in CRM.

III. Research challenges 

A. Data Mining Challenges & Opportunities in CRM 

In this section, we build upon our discussion of CRM to
identify key data mining challenges and opportunities in this
application domain. The following is a list of challenges for
CRM [2].

1) Non-trivial results almost always need a combination of
DM techniques.

Chaining/composition of DM operations, and more generally of
data analysis operations, is important. In order to analyze CRM
data, one needs to explore the data from different angles and
look at its different aspects. This requires applying
different types of DM techniques to different "slices" of data
in an interactive and iterative fashion. Hence the need to use
various DM operators and combine (chain) them into a single
"exploration plan" [2].

2) There is a strong requirement for data integration 
before data mining. 

Data comes from multiple sources. For example, in CRM the
data needed may come from different departments of an
organization. Since many interesting patterns span multiple
data sources, there is a need to integrate these data before
an actual data mining exploration can start [2].

3) Diverse data types are often encountered. 

This requires the integrated mining of diverse and 
heterogeneous data. In CRM, while dealing with this issue is 
not critical, it is nonetheless important. Customer data comes in 
the form of structured records of different data types (e.g.,
demographic data), temporal data (e.g., weblogs), text (e.g.,
emails, consumer reviews, blogs and chat-room data), and
sometimes audio (e.g., recorded phone conversations of
service reps with customers) [2].

4) Highly and unavoidably noisy data must be dealt with.

In CRM, weblog data has a lot of "noise" (due to crawlers,
missed hits because of the caching problem, etc.). Other data
pertaining to customer "touch points" has the usual cleaning
problems seen in any business-related data [2].

5) Privacy and confidentiality considerations for data and
analysis results are a major issue.

In CRM, much demographic data is highly confidential, as are
email and phone logs. Concern about inference capabilities
makes other forms of data sensitive as well; e.g., someone can
recover personally identifiable information (PII) from web
logs [2].

6) Legal considerations influence what data is available 
for mining and what actions are permissible. 

In some countries it is not allowed to combine data from
different sources or to use it for purposes different from those
for which it was collected. For instance, it may be
allowed to use an external rating about credit worthiness of a 
customer for credit risk evaluation but not for other purposes. 
Ownership of data can be unclear, depending on the details of 
how and why it was collected, and whether the collecting 
organization changes hands [2]. 

7) Real-world validation of results is essential for 
acceptance. 

In CRM, as in many DM applications, discovered patterns 
are often treated as hypotheses that need to be tested on new 
data using rigorous statistical tests for the actual acceptance of 
the results. This is even more so for taking or recommending 
actions, especially in such high-risk applications as in the 
financial and medical domains. Example: recommending 
investments to customers (it is actually illegal in the US to let 
software give investment advice) [2,3]. 

8) Developing deeper models of customer behavior: 
One of the key issues in CRM is how to understand
customers. Current models of customers are mainly built on
their purchase patterns and click patterns at web sites. Such
models are very shallow and do not have a deep understanding 
of customers and their individual circumstances. Thus, many 
predictions and actions about customers are wrong. It is 
suggested that information from all customer touch-points be 
considered in building customer models. Marketing and 
psychology researchers should also be involved in this effort. 
Two specific issues need to be considered here. First, what 
level should the customer model be built at, namely at the 
aggregate level, the segment level, or at the individual level? 
The deciding factor is how personalized the CRM effort needs 
to be. Second is the issue of the dimensions to be considered in 
the customer profile. These include demographic, 
psychographic, macro-behavior (buying, etc.), and micro- 
behavior (detailed actions in a store, e.g. individual clicks in an 
online store) features [2,3]. 

9) Acquiring data for deeper understanding in a non- 
intrusive, low-cost, high accuracy manner: 

In many industrial settings, collecting data for CRM is still 
a problem. Some methods are intrusive and costly. Datasets 
collected are very noisy and in different formats and reside in 
different departments of an organization. Solving these pre- 
requisite problems is essential for data mining applications [2]. 

10) Managing the "cold start/bootstrap" problem: 

At the beginning of the customer life cycle little is known, 
but the list of customers and the amount of information known 
for each customer increases over time. In most cases, a 
minimum amount of information is required for achieving 
acceptable results (for instance, product recommendations 
computed through collaborative filtering require a purchasing 
history of the customer). Being able to deal with cases where
less than this required minimum is known is therefore a
major challenge [2].

11) Evaluation framework for distinguishing between 
correct/incorrect customer understanding: 

Apart from the difficulty of building customer models, 
evaluating them is also a major task. There is still no 
satisfactory metric that can tell whether one model is better 
than another and whether a model really reflects customer 
behaviors. Although there are some metrics for measuring 
quality of customer models (e.g., there are several metrics for 
measuring the quality of recommendations), they are quite 
rudimentary, and there is a strong need to work on better 
measures. Specifically, the recommender systems community 
has explored this area [2,3,6,7]. 
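
As an illustration of the rudimentary recommendation-quality metrics mentioned above, the following minimal sketch computes precision and recall over the top k recommendations. It is not the evaluation procedure of [2]; the function name and the sample item identifiers are assumptions made for this example.

```python
def precision_recall_at_k(recommended, relevant, k):
    """Precision and recall of the first k recommended items.

    recommended: ranked list of item ids produced by a customer model
    relevant:    set of item ids the customer actually wanted (ground truth)
    """
    top_k = recommended[:k]
    hits = sum(1 for item in top_k if item in relevant)
    precision = hits / k if k else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Hypothetical data: four ranked suggestions, three truly relevant items.
precision, recall = precision_recall_at_k(["a", "b", "c", "d"], {"a", "c", "e"}, k=4)
```

Both numbers only measure whether suggested items match observed choices; they say nothing about whether the model reflects the customer's underlying circumstances, which is the gap the text points to.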

Fig. 1 shows how data mining techniques are used across the
CRM life cycle. The techniques include:

• Association rule; 

• Decision tree; 

• Genetic algorithm; 

• Neural networks; 

• K-Nearest neighbour; 

• Linear/logistic regression 




Figure 1. Classification framework for data mining techniques in CRM.

A graphical classification framework for data mining
techniques in CRM is proposed and shown in Fig. 1; it is based
on a review of the literature on data mining techniques in
CRM. Critically reviewing this literature helped to identify
the major CRM dimensions and data mining techniques for the
application of data mining in CRM. The framework describes
the CRM dimensions as Customer Identification, Customer
Attraction, Customer Retention and Customer Development. In
addition, it describes the types of data mining model as
Association, Classification, Clustering, Forecasting,
Regression, Sequence Discovery and Visualization. We provide a
brief description of the four CRM dimensions and some
references for further details; each of them is discussed in the
following section [6].

IV. Classification framework - CRM Dimensions 

In this study, CRM is defined as helping organizations to
better discriminate and more effectively allocate resources to
the most profitable group of customers through the cycle of
customer identification, customer attraction, customer retention
and customer development.

(i) Customer identification: CRM begins with customer 
identification, which is referred to as customer acquisition in 
some articles. This phase involves targeting the population 
who are most likely to become customers or most profitable to 
the company. Moreover, it involves analyzing customers who 
are being lost to the competition and how they can be won 
back. Elements for customer identification include target
customer analysis and customer segmentation. Target 
customer analysis involves seeking the profitable segments of 
customers through analysis of customers' underlying 
characteristics, whereas customer segmentation involves the 
subdivision of an entire customer base into smaller customer 
groups or segments, consisting of customers who are relatively 
similar within each specific segment. 
(ii) Customer attraction: This is the phase following customer
identification. After identifying the segments of potential 
customers, organizations can direct effort and resources into 
attracting the target customer segments. An element of 
customer attraction is direct marketing. Direct marketing is a 
promotion process which motivates customers to place orders 
through various channels. For instance, direct mail or coupon 
distribution are typical examples of direct marketing. 

(iii) Customer retention: This is the central concern for CRM.
Customer satisfaction, which refers to the comparison of
customers' expectations with their perception of being
satisfied, is the essential condition for retaining customers. As
such, elements of customer retention include one-to-one
marketing, loyalty programs and complaints management.
One-to-one marketing refers to personalized marketing
campaigns which are supported by analysing, detecting and
predicting changes in customer behaviours. Thus, customer
profiling, recommender systems or replenishment systems are 
related to one-to-one marketing. Loyalty programs involve 
campaigns or supporting activities which aim at maintaining a 
long term relationship with customers. Specifically, churn 
analysis, credit scoring, service quality or satisfaction form 
part of loyalty programs. 

(iv) Customer development: This involves consistent 
expansion of transaction intensity, transaction value and 
individual customer profitability. Elements of customer 
development include customer lifetime value analysis, 
up/cross selling and market basket analysis. Customer lifetime 
value analysis is defined as the prediction of the total net 
income a company can expect from a customer. Up/Cross 
selling refers to promotion activities which aim at augmenting 
the number of associated or closely related services that a 
customer uses within a firm. Market basket analysis aims at 
maximizing the customer transaction intensity and value by 
revealing regularities in the purchase behaviour of customers. 
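
The "regularities in purchase behaviour" that market basket analysis looks for are commonly expressed as association rules with a support and a confidence value. The sketch below counts item pairs in a handful of baskets and derives both measures. It is a minimal illustration, not a full association rule miner such as Apriori; the basket contents and the `min_support` threshold are assumptions chosen for the example.

```python
from collections import Counter
from itertools import combinations


def pair_rules(baskets, min_support=0.5):
    """Return {(a, b): (support, confidence)} for pairwise rules a -> b."""
    n = len(baskets)
    # Number of baskets containing each item, and each unordered pair of items.
    item_counts = Counter(item for basket in baskets for item in set(basket))
    pair_counts = Counter(
        pair for basket in baskets for pair in combinations(sorted(set(basket)), 2)
    )
    rules = {}
    for (a, b), count in pair_counts.items():
        support = count / n  # share of baskets containing both items
        if support < min_support:
            continue
        rules[(a, b)] = (support, count / item_counts[a])  # confidence of a -> b
        rules[(b, a)] = (support, count / item_counts[b])  # confidence of b -> a
    return rules


# Hypothetical transaction data.
baskets = [
    ["bread", "butter"],
    ["bread", "butter", "jam"],
    ["bread", "milk"],
    ["butter", "jam"],
]
rules = pair_rules(baskets, min_support=0.5)
```

In this toy data every basket containing jam also contains butter, so the rule jam -> butter has confidence 1.0, which is the kind of regularity an up/cross-selling campaign could act on.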

A. Good actioning mechanisms: 

Once data mining has been conducted with promising
results, how to use them in daily operational tasks is
critical and requires significant research effort. It is common
that after some data mining results are obtained, the domain
users do not know how to use them in their daily work. This
research may require the participation of business and
marketing researchers. Another way to accommodate actioning
mechanisms is to integrate them into the knowledge discovery
process by focusing on the discovery of actionable patterns in
customer data. This would make it easier for marketers or
other domain experts to determine which actions should be
taken once the customer patterns are discovered [2].

B. Incorporating prior knowledge: 

This has always been a problem in practice. Data mining
tends to find many patterns that are already known or
redundant. Incorporating prior domain knowledge can help to
solve these problems, and also to discover something novel.
However, the difficulties of incorporating domain knowledge
have resulted in little progress in the past. There are a number
of reasons for this. First of all, knowledge acquisition from
domain experts is very hard. This is well documented in AI
research, especially in the literature on building expert
systems. Domain experts may know a lot but are unable to
articulate it. Also, domain experts are often not sure what the
relevant domain knowledge is, which can be very wide, although
the data mining application itself is very narrow. Only after
domain experts have seen some discovered patterns do they
remember some domain knowledge. The second reason is the
algorithmic issue. Many existing methods have difficulty
incorporating sophisticated domain knowledge in the mining
algorithm. Also, once the new patterns are discovered, it is
important to develop methods that integrate the newly
discovered knowledge with the previous knowledge, thus
enhancing the overall knowledge base. Although there is some
general work on knowledge enhancement, much more needs to
be done to advance this area and adapt it to CRM problems.
Also, integration of these methods with existing and novel
Knowledge Management approaches constitutes a fruitful area
of research [2].

Customer relationship management in its broadest sense 
simply means managing all customer interactions. In practice, 
this requires using information about your customers and 
prospects to more effectively interact with your customers in 
all stages of your relationship with them. We refer to these 
stages as the customer life cycle. 

The customer life cycle has three stages: 

1. Acquiring customers

2. Increasing the value of customers 

3. Retaining good customers 

Data mining can improve your profitability in each of these 
stages when you integrate it with operational CRM systems or 
implement it as independent applications [4].

C. Acquiring new customers via data mining [4] 

The first step in CRM is to identify prospects and convert 
them to customers. Let's look at how data mining can help 
manage the costs and improve the effectiveness of a customer 
acquisition campaign. Big Bank and Credit Card Company 
(BB&CC) annually conducts 25 direct mail campaigns, each of 
which offers one million people the opportunity to apply for a 
credit card. The conversion rate measures the proportion of 
people who become credit card customers, which is about one 
percent per campaign for BB&CC. Getting people to fill out an 
application for the credit card is only the first step. Then, 
BB&CC must decide if the applicant is a good risk and accept 
them as a customer or decline the application. Not surprisingly, 
poor credit risks are more likely to accept the offer than are 
good credit risks. So while six percent of the people on the 
mailing list respond with an application, only about 16 percent 
of those are suitable credit risks; approximately one percent of 
the people on the mailing list become customers. BB&CC's six 
percent response rate means that only 60,000 people out of one 
million names respond to the solicitation. Unless BB&CC
changes the nature of the solicitation - using different mailing
lists, reaching customers in different ways, altering the terms of
the offer - it is not going to receive more than 60,000 responses.
And of those 60,000 responses, only 10,000 are good enough
risks to become customers. The challenge BB&CC faces is
reaching those 10,000 people most efficiently.
BB&CC spends about $1.00 per piece, for a total cost of
$1,000,000, to mail the solicitation. Over the next couple of 
years, the customers gained through this solicitation generate 
approximately $1,250,000 in profit for the bank (or about $125 
each), for a net return of $250,000 from the mailing. Data 
mining can improve this return. Although data mining won't 
precisely identify the 10,000 eventual credit card customers, 
data mining helps focus marketing efforts much more cost 
effectively. 
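
The baseline economics of the untargeted mailing can be checked with a few lines of arithmetic. The sketch below uses only the figures quoted in the text; the break-even rate is derived here rather than stated in [4], and the variable names are illustrative.

```python
MAILED = 1_000_000            # pieces mailed per campaign
COST_PER_PIECE = 1.00         # dollars per piece
CUSTOMERS_GAINED = 10_000     # acceptable applicants who become customers
PROFIT_PER_CUSTOMER = 125.0   # dollars of profit per new customer

mailing_cost = MAILED * COST_PER_PIECE
gross_profit = CUSTOMERS_GAINED * PROFIT_PER_CUSTOMER
net_return = gross_profit - mailing_cost

# Share of recipients who end up as customers (about one percent).
conversion_rate = CUSTOMERS_GAINED / MAILED

# Derived: a group of prospects is only worth mailing if its conversion
# rate exceeds the cost of one piece divided by the profit per customer.
break_even_rate = COST_PER_PIECE / PROFIT_PER_CUSTOMER
```

With these inputs the net return is $250,000 and the break-even conversion rate is 0.8 percent, so any segment of the list converting below that rate costs more to mail than it earns.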

First, BB&CC sent a test mailing of 50,000 prospects and 
carefully analyzed the results, building a predictive model 
showing who would respond (using a decision tree) and a 
credit scoring model (using a neural net). BB&CC then 
combined these two models to find the people who were both 
good credit risks and were most likely to respond to the offer. 
BB&CC applied the model to the remaining 950,000 people in 
the mailing list, from which 700,000 people were selected for 
the mailing. The result? From the 750,000 pieces mailed 
(including the test mailing), BB&CC received 9,000 acceptable 
applications for credit cards. In other words, the conversion rate
rose from one percent to 1.2 percent, a 20 percent increase.
While the targeted mailing only reaches 9,000 of the 10,000 
prospects - no model is perfect - reaching the remaining 1,000 
prospects is not profitable. Had they mailed the other 250,000 
people on the mailing list, the cost of $250,000 would have 
resulted in another $125,000 of gross profit for a net loss of 
$125,000. 
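
One plausible way to combine a response model with a credit scoring model is to multiply their outputs into an expected profit per prospect and mail only where that value is positive. The sketch below illustrates this idea. The source does not say how BB&CC combined its decision tree and neural net, so the multiplication rule, the `Prospect` fields and the sample scores are all assumptions.

```python
from dataclasses import dataclass

COST_PER_PIECE = 1.00
PROFIT_PER_CUSTOMER = 125.0


@dataclass
class Prospect:
    name: str
    p_respond: float     # hypothetical output of the response model
    p_good_risk: float   # hypothetical output of the credit scoring model


def expected_net(prospect):
    """Expected profit of mailing one prospect, net of the mailing cost.

    Assumes p_good_risk can be read as the probability of being an
    acceptable risk given that the prospect responds.
    """
    p_customer = prospect.p_respond * prospect.p_good_risk
    return p_customer * PROFIT_PER_CUSTOMER - COST_PER_PIECE


def select_for_mailing(prospects):
    """Keep only prospects whose expected net profit is positive."""
    return [p for p in prospects if expected_net(p) > 0]


# Hypothetical scores for three names on the mailing list.
prospects = [
    Prospect("A", p_respond=0.10, p_good_risk=0.50),
    Prospect("B", p_respond=0.20, p_good_risk=0.02),
    Prospect("C", p_respond=0.02, p_good_risk=0.30),
]
selected = [p.name for p in select_for_mailing(prospects)]
```

Prospect B illustrates the adverse selection noted earlier: a high likelihood of responding does not help if the applicant is unlikely to be an acceptable credit risk.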

The following table summarizes the results. 



TABLE I. Net profits from mailing

                             Old           New           Difference
Number of pieces mailed      1,000,000     750,000       250,000
Cost of mailing              $1,000,000    $750,000      $250,000
Number of responses          10,000        9,000         1,000
Gross profit per response    $125          $125          $0
Gross profit                 $1,250,000    $1,125,000    $125,000
Net profit                   $250,000      $375,000      $125,000
Cost of model                $0            $40,000       $40,000
Final profit                 $250,000      $335,000      $85,000



Notice that the net profit from the mailing increased 
$125,000. Even when you include the $40,000 cost of the data 
mining software and the computer and employee resources 
used for this modeling effort, the net profit increased $85,000. 
This translates to a return on investment (ROI) for modeling of 
over 200 percent, which far exceeds BB&CC's ROI 
requirements for a project. 
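
The ROI figure can be reproduced from the table values. The sketch below assumes ROI is defined as the profit gain after deducting the modeling cost, divided by the modeling cost; [4] does not spell out the definition, so this is one reading that is consistent with "over 200 percent".

```python
# Figures taken from Table I (dollars).
old_net = 1_250_000 - 1_000_000   # gross profit minus mailing cost, old approach
new_net = 1_125_000 - 750_000     # gross profit minus mailing cost, targeted mailing
model_cost = 40_000               # software, computer and employee resources

net_gain = new_net - old_net          # improvement before the modeling cost
final_gain = net_gain - model_cost    # improvement after the modeling cost
roi = final_gain / model_cost         # assumed definition of ROI for modeling
```

Under this definition the ROI is 2.125, i.e. 212.5 percent.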

D. Increasing the value of your existing customers [4] 

Cannons and Carnations (C&C) is a company that 
specializes in selling antique mortars and cannons as outdoor 
flower pots. It also offers a line of indoor flower pots made 
from large caliber antique pistols and a collection of muskets 
that have been converted to unique holders of long-stemmed 
flowers. The C&C catalog is sent to about 12 million homes. 
When a customer calls C&C to place an order, C&C identifies 
the caller using caller ID when possible; otherwise the C&C 
representative asks for a phone number or customer number
from the catalog mailing label. Next, the representative looks
up the customer in the database and then proceeds to take the 
order. 

C&C has an excellent chance of cross-selling, or selling the 
caller something additional. But C&C discovered that if the 
first suggestion fails and the representative suggests a second 
item, the customer might get irritated and hang up without 
ordering anything. And, there are some customers who resent 
any cross-selling attempts. Before implementing data mining, 
C&C was reluctant to cross-sell. Without a model, the odds of 
making the right recommendation were one in three. And, 
because making any recommendation is unacceptable for some 
customers, C&C wanted to be extremely sure that it never 
makes a recommendation when it should not. In a trial 
campaign, C&C had less than a one percent sales rate and 
received a substantial number of complaints. C&C was 
reluctant to continue cross-selling for such a small gain. 

The situation changed dramatically once C&C used data 
mining. Now the data mining model operates on the data. 
Using the customer information in the database and the new 
order, it tells the customer service representative what to 
recommend. C&C successfully sold an additional product to 
two percent of the customers and experienced virtually no 
complaints. Developing this capability involved a process 
similar to what was used to solve the credit card customer 
acquisition problem. As with that situation, two models were 
needed. 

The first model predicted if someone would be offended by 
additional product recommendations. C&C learned how its 
customers reacted by conducting a very short telephone survey. 
To be conservative, C&C counted anyone who declined to 
participate in the survey as someone who would find 
recommendations intrusive. Later on, to verify this assumption, 
C&C made recommendations to a small but statistically 
significant subset of those who had refused to answer the 
survey questions. To C&C's surprise, it discovered that the 
assumption was not warranted. This enabled C&C to make 
more recommendations and further increase profits. The 
second model predicted which offer would be most acceptable. 

In summary, data mining helped C&C better understand its 
customers' needs. When the data mining models were 
incorporated in a typical cross-selling CRM campaign, the 
models helped C&C increase its profitability by two percent. 
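
The two-model arrangement described for C&C can be sketched as a gate followed by a ranking: the first model decides whether any recommendation should be made, and the second picks a single offer. The code below is an illustration under stated assumptions. Both models are stand-in functions, and the threshold, customer fields and product names are hypothetical.

```python
def choose_offer(customer, offence_model, offer_model, max_offence=0.05):
    """Return one product to suggest, or None when no suggestion should be made.

    offence_model(customer) -> estimated probability the customer resents cross-selling
    offer_model(customer)   -> dict mapping product name to acceptance score
    """
    if offence_model(customer) > max_offence:
        return None  # never recommend to customers likely to be offended
    scores = offer_model(customer)
    if not scores:
        return None
    # Only the single best offer is returned, since a second suggestion
    # after a failed first one risks losing the whole order.
    return max(scores, key=scores.get)


# Stand-in models for illustration only.
def offence_model(customer):
    # Mirrors the conservative initial assumption that survey decliners
    # would find recommendations intrusive.
    return 0.9 if customer.get("declined_survey") else 0.01


def offer_model(customer):
    return {"musket flower holder": 0.2, "pistol flower pot": 0.6}


suggestion = choose_offer({"declined_survey": False}, offence_model, offer_model)
no_suggestion = choose_offer({"declined_survey": True}, offence_model, offer_model)
```

Keeping the gate separate from the ranking makes it easy to relax the gate later, as C&C did once it found that its assumption about survey decliners was not warranted.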

E. Increasing the value of your existing customers:
personalization via data mining [4]

Big Sam's Clothing (motto: "Rugged outdoor gear for city 
dwellers") developed a Web site to supplement its catalog. 
Whenever you enter Big Sam's site, the site greets you by 
displaying "Howdy Pardner!" However, once you have ordered 
or registered with Big Sam's, you are greeted by name. If you 
have a Big Sam's ordering record, Big Sam's will also tell you 
about any new products that might be of particular interest to 
you. When you look at a particular product, such as a 
waterproof parka, Big Sam's suggests other items that might 
supplement such a purchase. When Big Sam's first launched its 
site, there was no personalization. The site was just an online 
version of its catalog - nicely and efficiently done - but it didn't
take advantage of the sales opportunities the Web presents. 
Data mining greatly increased Big Sam's Web site sales. 
Catalogs frequently group products by type to simplify the 
user's task of selecting products. In an online store, however,
the product groups may be quite different, often based on 
complementing the item under consideration. In particular, the 
site can take into account not only the item you're looking at, 
but what is in your shopping cart as well, thus leading to even 
more customized recommendations. First, Big Sam's used 
clustering to discover which products grouped together 
naturally. Some of the clusters were obvious, such as shirts and 
pants. Others were surprising, such as books about desert 
hiking and snakebite kits. They used these groupings to make 
recommendations whenever someone looked at a product. Big 
Sam's then built a customer profile to help identify customers 
who would be interested in the new products that were 
frequently added to the catalog. Big Sam's learned that steering 
people to these selected products not only resulted in 
significant incremental sales, but also solidified its customer 
relationships. Surveys established that Big Sam's was viewed 
as a trusted advisor for clothing and gear. To extend its reach 
further, Big Sam's implemented a program through which 
customers could elect to receive e-mail about new products that 
the data mining models predicted would interest them. While 
the customers viewed this as another example of proactive 
customer service, Big Sam's discovered it was a program of 
profit improvement. The personalization effort paid off for Big 
Sam's, which experienced significant, measurable increases in 
repeat sales, average number of sales per customer and average 
size of sales. 
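
Once product groupings have been discovered by clustering, using them for on-site suggestions is straightforward. The sketch below takes the groupings as given and recommends items that share a group with the product being viewed or with anything in the shopping cart. The clustering step itself is not shown, and the function name and the parka group are hypothetical; the other two groups are the examples mentioned in the text.

```python
def recommend(viewed, cart, groups, limit=3):
    """Suggest items grouped with the viewed product or with cart contents."""
    context = [viewed, *cart]
    seen = set(context)          # never suggest what the customer already has
    suggestions = []
    for item in context:         # the viewed item takes priority over the cart
        for group in groups:
            if item not in group:
                continue
            for other in group:
                if other not in seen:
                    seen.add(other)
                    suggestions.append(other)
    return suggestions[:limit]


# Product groups as they might come out of a clustering run.
groups = [
    ["shirt", "pants"],
    ["desert hiking book", "snakebite kit"],
    ["waterproof parka", "rain hat"],   # hypothetical group
]
suggestions = recommend("waterproof parka", ["desert hiking book"], groups)
```

Here the customer viewing a parka with a desert hiking book in the cart is shown the rain hat first and the snakebite kit second.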

F. Retaining good customers via data mining [4] 

For almost every company, the cost of acquiring a new 
customer exceeds the cost of keeping good customers. This 
was the challenge facing KnowService, an Internet Service 
Provider (ISP) who experiences the industry-average attrition 
rate, eight percent per month. Since KnowService has one 
million customers, this means 80,000 customers leave each 
month. The cost to replace these customers is $200 each or 
$16,000,000 - plenty of incentive to start an attrition 
management program. The first thing KnowService needed to 
do was prepare the data used to predict which customers would 
leave. KnowService needed to select the variables from its 
customer database and, perhaps, transform them. The bulk of 
KnowService's users are dial-in clients (as opposed to clients 
who are always connected through a T1 or DSL line) so
KnowService knows how long each user was connected to the
Web. KnowService also knows the volume of data transferred
to and from a user's computer, the number of e-mail accounts a
user has, the number of e-mail messages sent and received 
along with the customer's service and billing history. In 
addition, KnowService has demographic data that customers 
provided at sign-up. Next, KnowService needed to identify 
who were "good" customers. This is not a data mining question 
but a business definition (such as profitability or lifetime value) 
followed by a calculation. KnowService built a model to 
profile its profitable customers and unprofitable customers. 
KnowService used this model not only for customer retention 
but to identify customers who were not yet profitable but might 
become so in the future. KnowService then built a model to 
predict which of its profitable customers would leave. As in 
most data mining problems, determining what data to use and 
how to combine existing data is much of the challenge in 
model development. For example, KnowService needed to 
look at time-series data such as the monthly usage. Rather than
using the raw time-series data, it smoothed the data by taking
rolling three-month averages. KnowService also calculated the 
change in the three-month average and tried that as a predictor. 
Some of the factors that were good predictors, such as 
declining usage, were symptoms rather than causes that could 
be directly addressed. Other predictors, such as the average 
number of service calls and the change in the average number 
of service calls, were indicative of customer satisfaction 
problems worth investigating. Predicting who would churn, 
however, wasn't enough. Based on the results of the modeling, 
KnowService identified some potential programs and offers 
that it believed would entice people to stay. For example, some 
churners were exceeding even the largest amount of usage 
available for a fixed fee and were paying substantial 
incremental usage fees. KnowService offered these users a 
higher-fee service that included more bundled time. Some 
users were offered more free disk space to store personal Web 
pages. KnowService then built models that would predict 
which would be the most effective offer for a particular user. 
To summarize, the churn project made use of three models.
One model identified likely churners, the next model picked 
the profitable potential churners worth keeping and the third 
model matched the potential churners with the most 
appropriate offer. The net result was a reduction in 
KnowService's churn rate from eight percent to 7.5 percent, 
which allowed KnowService to save $1,000,000 per month in 
customer acquisition costs. KnowService discovered that its 
data mining investment paid off - it improved customer 
relationships and dramatically increased its profitability. 
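
The usage features described for the churn model, a rolling three-month average and the change in that average, can be derived as follows. This is a minimal sketch in plain Python; the monthly usage figures are invented for illustration.

```python
def rolling_mean(values, window=3):
    """Average of each run of `window` consecutive values."""
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


def change(values):
    """Difference between each value and the one before it."""
    return [later - earlier for earlier, later in zip(values, values[1:])]


# Hypothetical monthly connection hours for one customer.
monthly_hours = [30, 33, 27, 24, 18, 12]
smoothed = rolling_mean(monthly_hours)   # three-month averages
trend = change(smoothed)                 # month-over-month change in the average
```

The smoothed series is [30.0, 28.0, 23.0, 18.0] and its change is [-2.0, -5.0, -5.0]; a persistently negative change is the "declining usage" symptom the text describes as a good predictor of churn.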

V. Conclusion 

Customer relationship management is essential to compete 
effectively in today's marketplace. The more effectively you 
can use information about your customers to meet their needs, 
the more profitable you will be. We can conclude that 
operational CRM needs analytical CRM with predictive data 
mining models at its core. The route to a successful business 
requires that you understand your customers and their 
requirements, and data mining is the essential guide [4]. 